Do more words have left or right NN-co-occurrences? Does the answer depend on the frequency range?
The following table shows the ratio of the number of words with at least one left co-occurrence to the number of words with at least one right co-occurrence. Only words of rank 1 … N are considered.
The data show a slight increase of the ratio for larger N. That is, as N grows, the share of words having left neighbors grows slightly relative to the share of words having right neighbors.
Query for the second table line:
select 100, (select count(distinct(w2_id)) from co_n where w2_id>100 and 200>=w2_id) / (select count(distinct(w1_id)) from co_n where w1_id>100 and 200>=w1_id);
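The query above can be generalized to an arbitrary N. The following is a minimal sketch, not the original tooling: it uses Python with an in-memory SQLite database as a stand-in for the real corpus database, and the rows are invented toy data. The helper name left_right_ratio and the constant ID_OFFSET are hypothetical; the offset of 100 between rank and word id is an assumption read off the query above. The columns freq and sig of co_n are likewise assumed and are not used here.

```python
import sqlite3

# Assumption: a word of rank r is stored with id 100 + r, as the query above suggests.
ID_OFFSET = 100

# Same structure as the query above, with the id range as parameters.
# "* 1.0" forces floating-point division, because SQLite divides integers as integers.
RATIO_SQL = """
SELECT
  (SELECT COUNT(DISTINCT w2_id) FROM co_n WHERE w2_id > :lo AND w2_id <= :hi) * 1.0
  /
  (SELECT COUNT(DISTINCT w1_id) FROM co_n WHERE w1_id > :lo AND w1_id <= :hi)
"""


def left_right_ratio(conn, n):
    """Words of rank 1..n with a left neighbor, divided by those with a right neighbor.

    Returns None if no word in the range has a right neighbor
    (SQLite yields NULL on division by zero).
    """
    params = {"lo": ID_OFFSET, "hi": ID_OFFSET + n}
    return conn.execute(RATIO_SQL, params).fetchone()[0]


# Toy data only: (w1_id, w2_id, freq, sig), where w1 is the left word and w2 the right word.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE co_n (w1_id INTEGER, w2_id INTEGER, freq INTEGER, sig REAL)")
conn.executemany(
    "INSERT INTO co_n VALUES (?, ?, ?, ?)",
    [
        (101, 102, 5, 10.0),
        (101, 103, 3, 8.0),
        (101, 104, 2, 6.0),
        (102, 103, 1, 4.0),
    ],
)

for n in (2, 4):
    print(n, left_right_ratio(conn, n))
```

A word counts as having a left neighbor if it occurs as w2_id, and as having a right neighbor if it occurs as w1_id, which is why the numerator counts w2_id and the denominator counts w1_id.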
The ratio clearly depends on the lower bound of the significance value used for the co-occurrence data. But in what way?
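One way to explore this question is to recompute the ratio while raising the minimum significance. The sketch below is illustrative only: it assumes that co_n carries a significance column named sig, uses SQLite with invented toy rows instead of the real corpus database, and the helper name ratio_at_threshold and the constant ID_OFFSET are hypothetical. It says nothing about how the ratio behaves on real data.

```python
import sqlite3

# Assumption: a word of rank r is stored with id 100 + r.
ID_OFFSET = 100

# Assumption: co_n has a column "sig" holding the significance of each pair.
# "* 1.0" forces floating-point division in SQLite.
THRESHOLD_SQL = """
SELECT
  (SELECT COUNT(DISTINCT w2_id) FROM co_n
    WHERE sig >= :min_sig AND w2_id > :lo AND w2_id <= :hi) * 1.0
  /
  (SELECT COUNT(DISTINCT w1_id) FROM co_n
    WHERE sig >= :min_sig AND w1_id > :lo AND w1_id <= :hi)
"""


def ratio_at_threshold(conn, n, min_sig):
    """Left/right ratio for ranks 1..n, using only pairs with sig >= min_sig.

    Returns None if no pair above the threshold has a left word in the range.
    """
    params = {"lo": ID_OFFSET, "hi": ID_OFFSET + n, "min_sig": min_sig}
    return conn.execute(THRESHOLD_SQL, params).fetchone()[0]


# Toy data only: (w1_id, w2_id, freq, sig).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE co_n (w1_id INTEGER, w2_id INTEGER, freq INTEGER, sig REAL)")
conn.executemany(
    "INSERT INTO co_n VALUES (?, ?, ?, ?)",
    [
        (101, 102, 5, 10.0),
        (101, 103, 3, 8.0),
        (102, 103, 2, 4.0),
        (103, 101, 1, 2.0),
    ],
)

# Sweep the lower bound of the significance and print the resulting ratio.
for min_sig in (0.0, 3.0, 5.0):
    print(min_sig, ratio_at_threshold(conn, 3, min_sig))
```

Running such a sweep on the real co_n table for a fixed N would show the direction in which the ratio moves as weakly significant pairs are dropped.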
5.1.9.4 Skewness in NN co-occurrences IV